ABSTRACT

A study examined the effectiveness of English 
dictionaries in assisting in error correction by beginning learners 
of English as a second language. Lexical errors made in examinations 
were collected and coded by type, and the usefulness of the "Longman 
Active Study Dictionary", designed for ESL learners, in helping to 
correct the errors was analyzed. The results suggest systematic 
differences in error types made by students of different language 
backgrounds. It is proposed that (1) analysis of errors made by
language learners can be useful to dictionary writers in increasing 
the materials' effectiveness and (2) a particular dictionary can vary 
in effectiveness among different target language groups. Further 
study is both recommended and planned. (MSE) 
1 Background



The immediate stimulus for this report was a paper given at the 1986
meeting of the British Association [...] Nesi. Nesi presented [...]
that dictionaries aimed at EFL learners were often ineffective, in
that the entries provided either failed to prevent an obvious error
or actually reinforced an error by giving the impression that [...]
use of a word. Clearly, if this claim is generalisable, then it has
serious implications for dictionary [...]. [...] was based on a very
small corpus of only 36 [...], and something much more substantial
would be needed to evaluate the claims she made. We were asked by
the Longman Group to collect a larger corpus of errors with a view
to making a similar analysis of dictionary effectiveness, but also
to bear in mind the wider possibilities that such a corpus [...]
offer as a general research tool.

2 The basic corpus 

From December 1986 to January 1987 we collected and coded a
substantial collection of errors. The data consists of 1364 errors
taken from a collection of [...] Certificate examination papers
kindly provided by the Cambridge [...] Examinations Syndicate. The
errors came from a total of 14 source languages, but approximately
50% of the data was taken from essays produced by native speakers of
Spanish.

The errors were entered on to a database which recognised five
different fields. These are:

a look-up word 

a short context 

a source language code 

an error type code 

a dictionary code 

The look-up word is the word that a writer would need to look up in
order to check what he had produced, i.e. in:

He was fond to drink a lot

the look-up word would be FOND, since one would reasonably expect to
find the relevant information under the head adjective rather than
under its accompanying preposition. The short context consists of as
much context as necessary to [...]. The source language code
indicated the L1 of the learner, which was [...] inferable from
[...] the examination took place.
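
The five fields can be pictured as a simple record. The sketch below is illustrative only: the class name, the field names, and the values "es" and "E" are assumptions made for the example, not the layout or contents of the database actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorRecord:
    """One coded lexical error (all names here are illustrative)."""
    look_up_word: str      # word the writer would look up to check the text
    short_context: str     # just enough context to show the error
    source_language: str   # code for the learner's L1
    error_type: int        # 0-5 on the six point error scale
    dictionary_code: str   # outcome of consulting the dictionary


# The "fond to" example from the text, which the paper treats as a
# usage error (type 4). The language code "es" and the dictionary
# code "E" are hypothetical values chosen for illustration.
example = ErrorRecord(
    look_up_word="fond",
    short_context="He was fond to drink a lot",
    source_language="es",
    error_type=4,
    dictionary_code="E",
)

print(example.look_up_word, example.error_type)
```

Keeping the look-up word separate from the context mirrors the point made above: the record is indexed by the word a learner would actually consult, not by the word that happens to be wrong.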

The error codes were based on a simple 6 point system: 

0 totally wrong word 

1 phonologically related word 

2 wrong word, right semantic area

3 formal derivational errors

4 usage 

5 spelling error 

Code 0 covers cases where the word is just wrong. Note that this
code does not say why the word is wrong, so that it treats errors
arising from L1 transfer in exactly the same way as any other error
of this type. This may seem counter-intuitive, but we justified it
on the grounds that the dictionary does not "know" what language
background its user shares, and so cannot take this information
into account.

Code 1 covers [...] similar [...].

Code 2 covers cases where the learner has chosen a word from the
right semantic area which is not quite right. The reasons for this
can [...] or register.

Code 3 covers [...] essentially correct but wrongly constructed
[...].

Code 4 covers cases where the word [...], but its context is
incorrect, e.g. FOND TO instead of FOND OF.

Code 5 [...] other than the one intended [...] ignored. So, for
example, LITE for LIGHT would not be coded here, but [...] for
LIGHT, or LIT for LIGHT would be.

[...] there is some [...] any particular error belongs to, but on
the whole [...] better than a much more complicated system which
attempts [...] distinctions. Examples of the error types are
provided in Table 1.

The dictionary code indicates what would happen if the writer
looked up the look-up word in the Longman Active Study Dictionary (a
dictionary specifically aimed at EFL speakers) in order to check
what s/he wanted to write. Bearing in mind that what s/he wrote is
erroneous, there are three possible logical outcomes:

a) the dictionary identifies the error and shows the user how to
correct it.

b) the dictionary identifies the error but fails to show the user how
to correct it.

c) the dictionary fails to identify the error.
TABLE 1 Examples of error codes

0: completely wrong word: 
supply 

...you would supply to change it...
piles 

...my tape-recorder had terrible piles... 

1: phonologically related word: 
punch 

...no sign of a punch on the tyre... 

2: wrong word from the right semantic area: 
tranquilise 
...my wife tranquilised me... 

3: formal errors 
amuse 

...there is an amusing arcade... 

4: usage 
access 

...he easily accessed drugs thanks to his money... 

5: spelling (where this resulted in another word) 
prize 

...the prize of the book was two pounds... 



TABLE 2 Dictionary codes 

C: clear: the word entry identifies the error and points to the
   correct [...]
Z: dead-end: entry identifies the error, but fails to offer any
   [...]
R: reference: as Z, but where reference to the correct word
   might have been expected
E: example: as Z, but where an example of correct usage would
   have prevented the error
P: prefix: as Z, but where additional information about other
   related forms would have prevented the error
X: invention: the word does not exist in the dictionary.
M: misleading: looking up the entry would not help to avoid the
   error
In practice, this three point system seemed rather too restricted,
and we eventually adopted a more complex 7 point system, which is
explained in more detail in Table 2.

Codes C and M correspond to cases (a) and (c) listed above. The 
remaining codes all correspond to particular instances of case (b). 
The paradigm of case (b) is Code Z, where the dictionary entry 
identifies the error, but fails to indicate what the writer should do 
about it. The other examples of this type, Codes R, E and P, indicate
cases where there is an obvious, systematic remedy for the failure of
the dictionary to say how the error should be corrected.
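
The relationship between the seven dictionary codes and the three logical outcomes can be written down as a small lookup table. This is only a sketch of the mapping described above: the names `OUTCOME_BY_CODE`, `HAS_SYSTEMATIC_REMEDY` and `outcome` are invented for the example, and Code X is placed under case (b) solely because the text says that all the remaining codes belong there.

```python
# Logical outcome (a), (b) or (c) for each dictionary code.
OUTCOME_BY_CODE = {
    "C": "a",  # entry identifies the error and shows the correction
    "M": "c",  # entry fails to identify the error
    "Z": "b",  # dead end: error identified, no help with the correction
    "R": "b",  # as Z, a reference to the correct word was missing
    "E": "b",  # as Z, an example of correct usage was missing
    "P": "b",  # as Z, information about related forms was missing
    "X": "b",  # grouped with the "remaining codes" in the text
}

# Codes for which the text describes an obvious, systematic remedy.
HAS_SYSTEMATIC_REMEDY = {"R", "E", "P"}


def outcome(code: str) -> str:
    """Return the logical outcome for a dictionary code (case-insensitive)."""
    return OUTCOME_BY_CODE[code.upper()]


print(outcome("C"), outcome("Z"), outcome("M"))
```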

3 Analysis of the data 

The data has been analysed in two ways: a) the distribution of the 
errors on each of the codings separately, and b) interactions between 
pairs of codings. 

3.a The individual codings

Figure 1a shows the basic distribution of errors according to error
type. The largest component by far consists of semantically based
errors (code 2), which account for almost half the entire corpus. The
second largest component is made up of usage errors (code 4), which
account for almost a quarter of the corpus. There are two other
sizable components, formal errors (code 3) and completely wrong words
(code 0), which each account for 15% of the errors. Phonologically
based errors and spelling errors which resulted in an incorrect word
together account for only 8% of the total corpus.

This distribution is quite interesting, and not entirely expected. In
particular, the fact that semantically related errors, usage errors
and formal errors together account for almost 80% of the total errors
is especially important, since these are precisely the sorts of 
errors that dictionaries ought to be capable of preventing. 

Figure 1b shows the distribution of the errors according to
dictionary code. There are three important things to note in this
data. Firstly, 33% of the entries are perfectly satisfactory (Code
C). Secondly, 6% of the entries are actually misleading (Code M).
Thirdly, 2[?]% of the entries are unsatisfactory, but could easily
be improved by the addition of obvious extra information (Codes R, E
and P). This leaves a substantial number of entries which are
unsatisfactory dead ends (Code Z). 35% of the entries fall into this
category (but cf. below, where this inclusion is qualified by
additional data).
Since we have not compared this data with any other dictionary, it is
not possible for us to say whether this distribution is good or bad.
It is certainly rather better than the figures quoted by Nesi, and an 
informal, subjective assessment suggests that the Active Study 
Dictionary comes out rather well. 

However, this analysis of the errors is a crude and superficial one,
and a more complex picture emerges if we look at how these global 
figures break down under closer examination. 

3.b Interactions between the codings 

Three interactions will be reported in this section: the interaction 
between error types and dictionary codes; the interaction between 
error types and source language; and the interaction between source 
language and dictionary codes. 

The interaction between error types and dictionary codes is reported 
in Table 3.



TABLE 3:

Interactions between dictionary codes and error type
(small numbers of cases shown as - for simplicity)

d-code:      C     E     M     P     R     X     Z

e-type:
  0                                       26   160
  1                                       24    53
  2        101     -    78     -   134         233
  3        136     -     -    22
  4        109    75
  5
  [?]                                           12

This table shows that our analysis of dictionary codes (Figure 1b)
needs to be treated with considerable caution. The main point to
emerge from this data is that dictionary codes are not distributed
evenly among the various error types. Clear entries (Code C) are
mainly associated with semantic errors, formal errors and usage
errors. Dead ends (Code Z) are principally associated with wholly 
wrong words and with semantically related errors. This distinction is 
important, of course. Basically, there is no reason why one would 
expect a dictionary to be able to handle a wholly incorrect word, so 
that though the 168 Code Z entries found with type 0 errors are 
unfortunate from the user's point of view, they can hardly be treated 
as a fundamental inadequacy in the dictionary. This means that we 
need to revise our estimate of the importance of dead ends in the
Active Study Dictionary: if we ignore these 168 entries, then there 
are only 208 real dead ends - 15% of the whole corpus. 

On the other hand, the 233 semantically related errors which evoked 
dead ends are obviously important. Two other important groups of 
errors also emerge from this analysis: the 78 semantically related 
errors that evoked misleading dictionary entries, and the 134 
semantically related errors that could have been sorted if the 
dictionary had referred to an obviously related word. These three 
combinations suggest that if the dictionary paid more attention to 
the relationships between words in the same semantic area, its 
efficiency could be increased considerably. Solving all three of 
these problems would have doubled the number of CLEAR entries, and 
reduced the total of unsatisfactory entries to minimal levels. 
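
The claim about doubling the number of CLEAR entries can be checked roughly against the figures quoted above. The short calculation below is a sketch under two assumptions: that the Code C cells printed in Table 3 for error types 2, 3 and 4 (101, 136 and 109) are the only sizable C cells, and that cells shown as - can be treated as zero. The variable names are invented for the example.

```python
# Semantically related (type 2) errors quoted in the text.
dead_ends = 233     # Code Z
misleading = 78     # Code M
reference = 134     # Code R

# Code C cells printed in Table 3 for error types 2, 3 and 4.
# Cells shown as "-" are treated as zero (an assumption).
clear_printed = 101 + 136 + 109

# Entries that would become CLEAR if all three problems were solved.
recoverable = dead_ends + misleading + reference
ratio = (clear_printed + recoverable) / clear_printed

print(recoverable)       # 445
print(clear_printed)     # 346
print(round(ratio, 2))   # 2.29
```

On these figures the number of clear entries would rise from 346 to 791, a little more than double, which agrees with the statement in the text.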

The only other combination worth commenting on is the 75 errors which 
could have been avoided if an example of usage had been included in 
the dictionary entry. 

The interaction between error types and source language is shown in 
Table 4. As in the previous data, this table has been simplified to 
make the patterns stand out more clearly. 

This analysis reveals a number of interesting features. Basically,
the proportion of different error types varies markedly from one 
language to another. Particularly noticeable is the high proportion 
of type 1 errors (phonologically related errors) for Chinese and
Indonesian. In no other language does the proportion of such errors 
exceed 10%. Equally remarkable is the fact that Indonesian has a very 
small proportion of semantically related errors, whereas in all other 
languages in the sample this type of error accounts for at least 27% 
of the total. Less striking, but perhaps just as important, are the
variations in the other columns: some languages appear to have 
relatively small proportions of type 0 errors (completely wrong 
words) while for some languages these errors are relatively frequent; 
some languages give rise to relatively high proportions of type 3 
errors (formal errors), while in other cases these formal errors are 
relatively few. 
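
A breakdown of this kind can be produced by cross-tabulating source language against error type and masking the small cells. The function below is a minimal sketch, not the procedure actually used: the function name and the toy records are invented, the 15% threshold follows the caption of Table 4, and Python's built-in rounding is assumed to be adequate.

```python
from collections import Counter, defaultdict


def error_type_percentages(records, threshold=15):
    """Tabulate error types per language as rounded percentages.

    records: iterable of (source_language, error_type) pairs.
    Cells below `threshold` percent are replaced by "-".
    """
    counts = defaultdict(Counter)
    for language, error_type in records:
        counts[language][error_type] += 1

    table = {}
    for language, by_type in counts.items():
        total = sum(by_type.values())
        row = {}
        for error_type in range(6):  # error codes 0 to 5
            pct = round(100 * by_type[error_type] / total)
            row[error_type] = pct if pct >= threshold else "-"
        table[language] = row
    return table


# Invented toy data for a hypothetical language "A", not corpus figures.
toy = [("A", 2)] * 5 + [("A", 4)] * 4 + [("A", 0)] * 1
print(error_type_percentages(toy)["A"])
```

Masking small cells makes the dominant pattern for each language easier to see, at the cost of hiding how the remaining errors are distributed.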

It is difficult to assess the importance of these distributions with 
any degree of certainty. The sample size for some languages is very
small, and there may be a high degree of sampling error involved. The 
figures for Spanish will be fairly reliable, however, since the total 
sample size for that language is large (546 items). In this case, 
the bulk of the errors are semantic errors (42%), with a further 37% 
of the total evenly divided between formal errors and wholly wrong 
words. No other language produces a pattern of errors which resembles 
this one very closely.

[...] from different backgrounds [...] studies of errors have
concentrated on errors collected from [...] example of this,
however, [...] we put together the data reported in Tables 3 and
[...] and look at the [...] they rise above 13%, but even this
figure seems unacceptably high.
4 Discussion

[...] accept this criticism, but nonetheless it [...] appear that
[...] larger project which [...] throw more light on [...]. [...]
corpus of that order of magnitude could certainly not be dismissed
as negligible, and should [...] important research tool for the
[...].
TABLE 4: Error types by source language
figures show percentage of errors in each category.
- indicates that less than 15% of errors fell in this category



e-type: 
s-lang: 



French - - 40 -
German - - 51 -
Dutch - - 34 15
Spanish 18 - 42
Italian 27 - 49
Swedish - - 34 25
Norwegian - 33 30
Finnish - 31 32
Arabic 17 37
Japanese - - 4[?]
Chinese [?] 24 29 27
Greek 16 - 34
Indonesian - 52 - -
Swahili 25 - 27 . 16



27 
19 
26 
19 



19 
22 
29 
27 

30 
31 
16 



TABLE 5: Dictionary codes by source language

figures show percentage of codings in each category.

- indicates that less than 10% of codings fell in this category.



d-code: C E M P R X Z

s-lang:

French 33 11 10 - 11 

German 27 / 1 

Dutch 34 - 10 - 10 - — 



28 

15 - 32 



Spanish 30 

Italian 21 - - - 29 IX 

Swedish 34 - _ _ !5 00 

Norwegian 34 - 11 - 15 93 

Finnish 45 - 13 ,« 

Arabic 44 - - - 13 - 31 

Japanese 41 - _ _ _ ~ 30 

Chinese 25 - - _ I ~ ro 

Greek 48 - - - 10 - 24

Indonesian 48 12 12 - 9/ 

Swahili 25 ~- - _ I : £ 